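Linear functions play a key role in the runtime analysis of evolutionary algorithms, and their study has provided a wide range of new insights and techniques for analyzing evolutionary computation methods. Motivated by studies of separable functions and the optimization behaviour of evolutionary algorithms, as well as by objective functions from the area of chance-constrained optimization, we study the class of objective functions that are weighted sums of two transformed linear functions. Our results show that the (1+1) EA, with a mutation rate depending on the number of overlapping bits of the functions, obtains an optimal solution for these functions in expected time O(n log n), thereby generalizing a well-known result for linear functions to a much wider range of problems.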
Translated by Google Translate
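To make the algorithm named in the abstract above concrete, here is a minimal sketch of a (1+1) EA maximizing a plain linear function. It is an illustration under stated assumptions, not the paper's setting: it uses the standard mutation rate 1/n rather than the overlap-dependent rate the abstract refers to, and the names `linear_fitness` and `one_plus_one_ea` are hypothetical.

```python
import random


def linear_fitness(x, weights):
    """Value of the linear function sum_i weights[i] * x[i] for a bit string x."""
    return sum(w * b for w, b in zip(weights, x))


def one_plus_one_ea(weights, mutation_rate=None, max_iters=100_000, seed=0):
    """(1+1) EA sketch: keep one parent, flip each bit independently, accept if not worse.

    Assumes positive integer weights, so the all-ones string is the unique optimum.
    The default mutation rate 1/n is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    n = len(weights)
    p = mutation_rate if mutation_rate is not None else 1.0 / n
    x = [rng.randint(0, 1) for _ in range(n)]
    fx = linear_fitness(x, weights)
    optimum = sum(weights)
    for t in range(max_iters):
        if fx == optimum:
            return x, t  # number of iterations used before reaching the optimum
        # Standard bit mutation: flip each bit independently with probability p.
        y = [1 - b if rng.random() < p else b for b in x]
        fy = linear_fitness(y, weights)
        if fy >= fx:  # elitist acceptance
            x, fx = y, fy
    return x, max_iters


weights = list(range(1, 21))
best, iterations = one_plus_one_ea(weights)
```

With 20 bits the run typically needs only a few hundred iterations, which is in line with the O(n log n) expected time known for linear functions.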
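Chance-constrained optimization problems make it possible to model problems in which constraints involving stochastic components may be violated only with a small probability. Evolutionary algorithms have been applied to this scenario and have been shown to achieve high-quality results. In this paper, we contribute to the theoretical understanding of evolutionary algorithms for chance-constrained optimization. We study the scenario of stochastic components that are independent and normally distributed. Considering the simple single-objective (1+1) EA, we show that imposing an additional uniform constraint already leads to local optima in very restricted scenarios and to an exponential optimization time. We therefore introduce a multi-objective formulation of the problem that trades off the expected cost against its variance. We show that multi-objective evolutionary algorithms are highly effective when using this formulation and obtain a set of solutions that contains an optimal solution for any possible confidence level imposed on the constraint. Furthermore, we prove that this approach can also be used to compute a set of optimal solutions for the chance-constrained minimum spanning tree problem. To deal with the potentially exponentially many trade-offs in the multi-objective formulation, we propose and analyze improved convex multi-objective approaches. Experimental investigations on instances of the NP-hard stochastic minimum weight dominating set problem confirm the benefit of the multi-objective and the improved convex multi-objective approaches.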
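For independent, normally distributed components, a chance constraint has a well-known deterministic equivalent: the probability that the total weight exceeds a bound B is at most alpha exactly when mean + K_alpha * sqrt(variance) <= B, where K_alpha is the (1 - alpha)-quantile of the standard normal distribution. The sketch below illustrates this and the (expected cost, variance) pair that a bi-objective formulation trades off. The function names and the toy numbers are assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def mean_and_variance(x, means, variances):
    """Expected cost and variance of the selected items (independent components)."""
    mean = sum(m for m, b in zip(means, x) if b)
    var = sum(v for v, b in zip(variances, x) if b)
    return mean, var


def chance_constrained_cost(x, means, variances, alpha):
    """Smallest bound B such that Pr[cost(x) > B] <= alpha under normality."""
    mean, var = mean_and_variance(x, means, variances)
    k_alpha = NormalDist().inv_cdf(1.0 - alpha)  # standard normal quantile
    return mean + k_alpha * sqrt(var)


# Toy instance: items 0 and 1 are selected, item 2 is not.
x = [1, 1, 0]
means = [2.0, 4.0, 10.0]
variances = [1.0, 3.0, 5.0]
objectives = mean_and_variance(x, means, variances)  # the two objectives
bound_95 = chance_constrained_cost(x, means, variances, alpha=0.05)
```

Because the bound depends on the solution only through its mean and variance, a set of solutions that covers the relevant mean/variance trade-offs can serve every confidence level at once, which is the intuition behind the multi-objective formulation.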
Neural networks are increasingly applied in safety-critical domains, so their verification is gaining importance. A large class of recent algorithms for proving input-output relations of feed-forward neural networks is based on linear relaxations and symbolic interval propagation. However, due to variable dependencies, the approximations deteriorate with increasing depth of the network. In this paper we present DPNeurifyFV, a novel branch-and-bound solver for ReLU networks with a low-dimensional input space that is based on symbolic interval propagation with fresh variables and input splitting. A new heuristic for choosing the fresh variables ameliorates the dependency problem, while our novel splitting heuristic, in combination with several other improvements, speeds up the branch-and-bound procedure. We evaluate our approach on the airborne collision avoidance networks ACAS Xu and demonstrate runtime improvements compared to state-of-the-art tools.
Insects as pollinators play a key role in ecosystem management and world food production. However, insect populations are declining, which creates a global need for insect monitoring. Existing methods analyze video or time-lapse images of insects in nature, but the analysis is challenging since insects are small objects in complex and dynamic scenes of natural vegetation. This paper provides a dataset of primarily honeybees visiting three different plant species during two summer months. The dataset consists of more than 700,000 time-lapse images from multiple cameras, including more than 100,000 annotated images. The paper presents a new pipeline for detecting insects in time-lapse RGB images. The pipeline is a two-step process. First, the time-lapse RGB images are preprocessed to enhance insects in the images. We propose a new preprocessing enhancement method, Motion-Informed-enhancement, which uses motion and colors to enhance insects in images. The enhanced images are subsequently fed into a convolutional neural network (CNN) object detector. Motion-Informed-enhancement improves the deep learning object detectors You Only Look Once (YOLO) and Faster Region-based Convolutional Neural Network (Faster R-CNN). Using Motion-Informed-enhancement, the YOLO detector improves its average micro F1-score from 0.49 to 0.71, and the Faster R-CNN detector improves its average micro F1-score from 0.32 to 0.56 on our dataset. Our datasets are published at https://vision.eng.au.dk/mie/
Reliable application of machine learning-based decision systems in the wild is one of the major challenges currently investigated by the field. A large portion of established approaches aims to detect erroneous predictions by assigning confidence scores. This confidence may be obtained by quantifying the model's predictive uncertainty, by learning explicit scoring functions, or by assessing whether the input is in line with the training distribution. Curiously, while these approaches all claim to address the same eventual goal of detecting failures of a classifier in real-life application, they currently constitute largely separate research fields with individual evaluation protocols, which either exclude a substantial part of relevant methods or ignore large parts of relevant failure sources. In this work, we systematically reveal current pitfalls caused by these inconsistencies and derive requirements for a holistic and realistic evaluation of failure detection. To demonstrate the relevance of this unified perspective, we present a large-scale empirical study that, for the first time, enables benchmarking of confidence scoring functions w.r.t. all relevant methods and failure sources. The revelation of a simple softmax response baseline as the overall best performing method underlines the drastic shortcomings of current evaluation in the abundance of publicized research on confidence scoring. Code and trained models are at https://github.com/IML-DKFZ/fd-shifts.
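The softmax response baseline mentioned above is commonly understood as using the largest softmax probability of a classifier as its confidence score. The following is a minimal sketch under that reading. The function names are hypothetical and the code is not taken from the linked repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def max_softmax_response(logits):
    """Return (predicted class index, confidence = largest softmax probability)."""
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)
    return pred, probs[pred]


pred, confidence = max_softmax_response([2.0, 0.0, 0.0])
```

A failure detector built on this score would flag predictions whose confidence falls below a chosen threshold. The threshold itself is a design choice that the abstract does not specify.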
Pre-trained protein language models have demonstrated significant applicability in different protein engineering tasks. A common way to use the latent representations of these pre-trained transformer models is to mean-pool across residue positions, reducing the feature dimensions for downstream tasks such as predicting biophysical properties or other functional behaviours. In this paper we provide a two-fold contribution to machine learning (ML) driven drug design. Firstly, we demonstrate the power of sparsity by promoting penalization of pre-trained transformer models to secure more robust and accurate melting temperature (Tm) prediction of single-chain variable fragments, with a mean absolute error of 0.23 °C. Secondly, we demonstrate the power of framing our prediction problem in a probabilistic framework. Specifically, we advocate adopting probabilistic frameworks, especially in the context of ML-driven drug design.
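The mean pooling described above can be sketched as follows: a sequence of per-residue embedding vectors is averaged position-wise into a single fixed-length vector. The function name and the toy embeddings are assumptions made for illustration. A real pipeline would obtain the per-residue vectors from a protein language model.

```python
def mean_pool(residue_embeddings):
    """Average per-residue embedding vectors into one fixed-length vector.

    residue_embeddings: list of equal-length lists, one per residue position.
    """
    if not residue_embeddings:
        raise ValueError("need at least one residue embedding")
    length = len(residue_embeddings)
    dim = len(residue_embeddings[0])
    return [sum(row[j] for row in residue_embeddings) / length for j in range(dim)]


# Toy example: a 3-residue sequence with 2-dimensional embeddings.
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

The pooled vector has the embedding dimension regardless of sequence length, which is what makes it usable as input to a downstream regressor.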
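Unsupervised sentiment analysis is traditionally performed by counting the words in a text that appear in a sentiment lexicon and then assigning a label according to the proportion of positive and negative words registered. While these "counting" methods are considered beneficial because they rate a text deterministically, their classification rates decrease when the analyzed texts are short or when the vocabulary differs from what the lexicon considers default. The model proposed in this paper, called Lex2Sent, is an unsupervised sentiment analysis method that improves the classification of sentiment lexicon methods. For this purpose, a Doc2Vec model is trained to determine the distances between document embeddings and the embeddings of the positive and negative parts of a sentiment lexicon. These distances are then evaluated over multiple executions of Doc2Vec on resampled documents and averaged to perform the classification task. For the three benchmark datasets considered in this paper, the proposed Lex2Sent outperforms every evaluated lexicon, including state-of-the-art lexica such as VADER or the Opinion Lexicon, in terms of classification rate.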
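The traditional "counting" approach described at the start of this abstract can be sketched in a few lines. This is only the lexicon baseline, not Lex2Sent itself. The tiny word sets, the tie-breaking to "neutral", and the function name are assumptions made for illustration.

```python
import string


def lexicon_label(text, positive_words, negative_words):
    """Label a text by comparing counts of positive and negative lexicon words."""
    tokens = [t.strip(string.punctuation) for t in text.lower().split()]
    pos = sum(t in positive_words for t in tokens)
    neg = sum(t in negative_words for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # tie or no lexicon word found (assumed convention)


# Toy lexicon, invented for this example.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}

label = lexicon_label("Great plot, but terrible pacing and bad acting.", POSITIVE, NEGATIVE)
```

The sketch also shows the weakness the abstract points out: a short text that contains no lexicon word cannot be classified at all.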
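How to integrate an emerging and all-pervasive technology such as AI into the structures and operations of our society is a question of contemporary politics, science, and public debate. It has produced a considerable amount of international academic literature from different disciplines. This article analyzes the academic debate around the regulation of artificial intelligence (AI). The systematic review comprises a sample of 73 peer-reviewed journal articles published between January 1, 2016 and December 31, 2020. The analysis concentrates on societal risks and harms, questions of regulatory responsibility, and possible policy frameworks, including risk-based and principle-based approaches. The main interest is in proposed regulatory approaches and instruments. Various forms of intervention, such as bans, approvals, standard-setting, and disclosure, are presented. The assessments of the included papers indicate the complexity of the field, which points to its prematurity and a remaining lack of clarity. By presenting a structured analysis of the academic debate, we contribute both empirically and conceptually to a better understanding of the nexus of AI and regulation and of the underlying normative decisions. A comparison of the scientific proposals with the proposed European AI regulation illustrates the specific approach of the regulation, along with its strengths and weaknesses.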
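Diffusion models are a class of generative models that, when trained on natural image datasets, show superior performance in creating realistic images compared to other generative models. We introduce DISPR, a diffusion-based model for solving the inverse problem of predicting three-dimensional (3D) cell shapes from two-dimensional (2D) single-cell microscopy images. Using the 2D microscopy image as a prior, DISPR is conditioned to predict realistic 3D shape reconstructions. To showcase the applicability of DISPR as a data augmentation tool in a feature-based single-cell classification task, we extract morphological features from cells grouped into six highly imbalanced classes. Adding features from DISPR predictions to the three minority classes improved the macro F1 score from $F1_\text{macro} = 55.2 \pm 4.6\%$ to $F1_\text{macro} = 72.2 \pm 4.9\%$. Since our method is the first to employ a diffusion-based model in this context, we demonstrate that diffusion models can be applied to inverse problems in 3D, and that they learn to reconstruct 3D shapes with realistic morphological features from 2D microscopy images.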
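The macro F1 score reported above is the unweighted mean of the per-class F1 scores, which is why it is sensitive to performance on minority classes. A minimal sketch of that computation follows. The function name and the toy labels are assumptions for illustration and are unrelated to the paper's data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all classes seen in y_true or y_pred."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


score = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
```

Because every class contributes equally to the average, improving a rare class moves the macro F1 as much as improving a frequent one.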
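We introduce FedD3, a novel federated learning framework that reduces the overall communication volume and thereby opens up the concept of federated learning to more application scenarios in network-constrained environments. It achieves this by leveraging local dataset distillation instead of traditional learning approaches, (i) to significantly reduce communication volumes and (ii) to limit transfers to one-shot communication rather than iterative multi-way communication. Instead of sharing model updates, as other federated learning approaches do, FedD3 allows the connected clients to distill their local datasets independently, and then aggregates those decentralized distilled datasets (typically in the form of a few unrecognizable images, which are normally smaller than a model) across the network only once to form the final model. Our experimental results show that FedD3 significantly outperforms other federated learning frameworks in terms of required communication volume, while also making it possible to balance the trade-off between accuracy and communication cost, depending on the usage scenario or target dataset. For instance, when training an AlexNet model on a non-IID CIFAR-10 dataset with 10 clients, FedD3 can either increase accuracy by over 71% at a similar communication volume, or save 98% of the communication volume while reaching the same accuracy, compared to other federated learning approaches.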